Effectively long-distance dependencies in French: annotation and parsing evaluation
Abstract
We describe the annotation of cases of extraction in French whose previous annotations in the available French treebanks were insufficient to recover the correct predicate-argument dependency between the extracted element and its head. These are special cases of long-distance dependencies (LDDs), which we call effectively long-distance dependencies (eLDDs), in which the extracted element is separated from its head by one or more intervening heads (instead of zero, one or more in the general case). We found that extraction of a dependent of a finite verb is very rarely an eLDD (one case out of 420 000 tokens), but eLDDs corresponding to extraction out of infinitival phrases are more frequent (one third of all occurrences of the accusative relative pronoun que), and eLDDs with extraction out of NPs are quite common (2/3 of the occurrences of the relative pronoun dont). We also use the annotated data in statistical dependency parsing experiments, and compare several parsing architectures able to recover non-local governors for extracted elements.
Similar resources
PLCFRS Parsing of English Discontinuous Constituents
This paper proposes a direct parsing of non-local dependencies in English. To this end, we use probabilistic linear context-free rewriting systems for data-driven parsing, following recent work on parsing German. In order to do so, we first perform a transformation of the Penn Treebank annotation of non-local dependencies into an annotation using crossing branches. The resulting treebank can be...
Multi-lingual dependency parsing evaluation: a large-scale analysis of word order properties using artificial data
Fair comparative performance evaluation is one of the difficulties for work on multilingual parsing. The differences in parsing performance can be the result of disparate properties of treebanks (such as their size or average sentence length), choices in annotation schemes, and the linguistic properties of languages. We propose a method to tease apart the effects of these factors in parsing per...
Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation
Methods for evaluating dependency parsing using attachment scores are highly sensitive to representational variation between dependency treebanks, making cross-experimental evaluation opaque. This paper develops a robust procedure for cross-experimental evaluation, based on deterministic unification-based operations for harmonizing different representations and a refined notion of tree edit dist...
Evaluation of Two-level Dependency Representations of Argument Structure in Long-Distance Dependencies
Full recovery of argument structure information for question answering or information extraction requires that parsers can analyse long-distance dependencies. Previous work on statistical dependency parsing has used post-processing or additional training data to tackle this complex problem. We evaluate an alternative approach to recovering long-distance dependencies. This approach uses a two-le...
The Ongoing Evaluation Campaign of Syntactic Parsing of French: EASY
This paper presents EASY (Evaluation of Analyzers of SYntax), an ongoing evaluation campaign of syntactic parsing of French, a subproject of EVALDA in the French TECHNOLANGUE program. After presenting the elaboration of the annotation formalism, we describe the corpus building steps, the annotation tools, the evaluation measures and finally, plans to produce a validated large linguistic resourc...
Publication date: 2012